Scaling up Dynamic Topic Models

Authors

  • Arnab Bhadury
  • Jianfei Chen
  • Jun Zhu
  • Shixia Liu
Abstract

Dynamic topic models (DTMs) are very effective at discovering topics and capturing how they evolve in time series data. Existing methods for posterior inference in DTMs are all batch algorithms: they scan the full dataset before each model update and rely on inexact variational approximations with mean-field assumptions. Because no more scalable inference algorithm has been available, DTMs, despite their usefulness, have not been applied to capture large-scale topic dynamics. This paper fills that gap with a fast, parallelizable inference algorithm that combines Gibbs sampling with Stochastic Gradient Langevin Dynamics and does not make any unwarranted assumptions. We also present a Metropolis-Hastings based O(1) sampler for the topic assignment of each word token. In a distributed environment, our algorithm requires very little communication between workers during sampling (it is almost embarrassingly parallel) and scales up to large-scale applications. To our knowledge, we learn the largest dynamic topic model to date, capturing the dynamics of 1,000 topics from 2.6 million documents in less than half an hour. Our empirical results show that our algorithm is not only orders of magnitude faster than the baselines but also achieves lower perplexity.
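To make the Stochastic Gradient Langevin Dynamics (SGLD) ingredient mentioned in the abstract more concrete, here is a minimal sketch of a generic SGLD update on a toy problem. It is not the authors' algorithm: the model (a Gaussian mean with a Gaussian prior), the function name `sgld_gaussian_mean`, the constant step size, and all default parameter values are assumptions chosen purely for illustration. The point it illustrates is that each update touches only a small minibatch rather than the full dataset, in contrast to the batch methods the abstract describes.

```python
import math
import random


def sgld_gaussian_mean(data, n_steps=3000, batch_size=50, step=1e-4,
                       prior_std=10.0, seed=0):
    """Draw approximate posterior samples of a Gaussian mean with SGLD.

    Toy model (assumed for illustration, not the DTM itself):
        theta ~ Normal(0, prior_std^2),  x_i ~ Normal(theta, 1).
    A constant step size is used for simplicity; SGLD is usually
    described with a decreasing step-size schedule.
    """
    rng = random.Random(seed)
    n_total = len(data)
    theta = 0.0
    samples = []
    for _ in range(n_steps):
        # Minibatch drawn without replacement from the full dataset.
        batch = rng.sample(data, batch_size)
        # Gradient of the log prior at the current theta.
        grad_prior = -theta / prior_std ** 2
        # Minibatch gradient of the log likelihood, rescaled by N / n
        # so that it is an unbiased estimate of the full-data gradient.
        grad_lik = (n_total / batch_size) * sum(x - theta for x in batch)
        # Langevin step: half a gradient step plus Gaussian noise
        # whose variance equals the step size.
        noise = rng.gauss(0.0, math.sqrt(step))
        theta += 0.5 * step * (grad_prior + grad_lik) + noise
        samples.append(theta)
    return samples


if __name__ == "__main__":
    data_rng = random.Random(1)
    data = [data_rng.gauss(2.0, 1.0) for _ in range(1000)]
    kept = sgld_gaussian_mean(data)[500:]  # discard early iterations
    print("posterior mean estimate:", sum(kept) / len(kept))
    print("data mean:", sum(data) / len(data))
```

With this weak prior, the average of the retained samples should land close to the empirical mean of the data. In a dynamic topic model the same kind of update would presumably be applied to the continuous topic parameters, with the discrete topic assignments handled by a separate sampler, but that wiring is not shown here.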

Similar resources

A new 2D block ordering system for wavelet-based multi-resolution up-scaling

A complete and accurate analysis of the complex spatial structure of heterogeneous hydrocarbon reservoirs requires detailed geological models, i.e. fine resolution models. Due to the high computational cost of simulating such models, single resolution up-scaling techniques are commonly used to reduce the volume of the simulated models at the expense of losing the precision. Several multi-scale ...

Scaling production and improving efficiency in DEA: an interactive approach

DEA models help a DMU to detect its (in-)efficiency and to improve activities, if necessary. Efficiency is only one economic aim for a decision-maker; however, up- or downsizing might be a second one. Improving efficiency is the main topic in DEA; the long-term strategy towards the right production size should attract our attention as well. Not always the management of a DMU primarily focuses o...

Generalizing and Scaling up Dynamic Topic Models via Inducing Point Variational Inference

Dynamic topic models (DTMs) model the evolution of prevalent themes in literature, online media, and other forms of text over time. DTMs assume that topics change continuously over time and therefore impose continuous stochastic process priors on their model parameters. In this paper, we extend the class of tractable priors from Wiener processes to the generic class of Gaussian processes (GPs)....

Dynamic Scaling Phenomena in Growth Processes

Inhomogeneities in a deposition process may lead to formation of rough surfaces. Fluctuations in the height h(x, t), of the surface (at location x and time t) can be probed directly by scanning microscopy, or indirectly by scattering. Analytical or numerical treatments of simple growth models suggest that, quite generally, the height fluctuations have a self-similar character; their average cor...

On the Use of Microarchitecture-Driven Dynamic Voltage Scaling

This paper proposes microarchitecture-driven dynamic voltage scaling as a viable solution to power efficient architectures, with little or no performance penalty. The run-time behavior exhibited by common applications, with active periods, alternated with stall periods due to cache misses, is exploited to reduce the dynamic component of power consumption via selective voltage scaling. As it is ...

Publication date: 2016